A Penetration Method for UAV Based on Distributed Reinforcement Learning and Demonstrations

Abstract

The penetration of unmanned aerial vehicles (UAVs) is an essential link in modern warfare. Enhancing a UAV's ability of autonomous penetration through machine learning has become a research hotspot. However, the current generation of autonomous penetration strategies for UAVs faces the problem of excessive sample demand. To reduce the sample demand, this paper proposes a combination policy learning (CPL) algorithm that combines distributed reinforcement learning and demonstrations. Innovatively, the action of the CPL algorithm is jointly determined by the initial policy obtained from demonstrations and the target policy in the asynchronous advantage actor-critic network, thus retaining the guiding role of the demonstrations in the initial training. In a complex and unknown dynamic environment, 1000 training experiments and 500 test experiments were conducted for the CPL algorithm and related baseline algorithms. The results show that the CPL algorithm has the smallest sample demand, the highest convergence efficiency, and the highest penetration success rate among all the algorithms, and has strong robustness in dynamic environments.
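The abstract states only that the action is jointly determined by a demonstration-derived initial policy and the actor-critic target policy; it does not give the combination rule. The Python sketch below is one possible illustration, not the paper's method. It assumes a discrete action space, a convex mixture of the two action distributions, and a linearly decaying weight on the demonstration policy. The names `mixing_weight`, `combine_policies`, `sample_action`, and `decay_steps` are hypothetical.

```python
import random


def mixing_weight(step, decay_steps):
    """Weight on the demonstration policy (assumed schedule, not from the paper).

    Starts at 1.0 and decays linearly to 0.0 after `decay_steps` steps.
    """
    return max(0.0, 1.0 - step / decay_steps)


def combine_policies(demo_probs, target_probs, weight):
    """Convex mixture of two action distributions over the same discrete actions."""
    if len(demo_probs) != len(target_probs):
        raise ValueError("both policies must cover the same action set")
    return [weight * d + (1.0 - weight) * t
            for d, t in zip(demo_probs, target_probs)]


def sample_action(probs, rng):
    """Draw one action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    demo = [0.7, 0.2, 0.1]    # hypothetical demonstration policy output
    target = [0.2, 0.3, 0.5]  # hypothetical actor-critic policy output
    for step in (0, 500, 1000):
        w = mixing_weight(step, decay_steps=1000)
        mixed = combine_policies(demo, target, w)
        print(step, w, mixed, sample_action(mixed, rng))
```

Under these assumptions the demonstration policy dominates early training and the learned policy takes over as the weight reaches zero, which matches the abstract's remark that demonstrations keep a guiding role in the initial training.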


Related articles

Learning and Control of UAV Maneuvers Based on Demonstrations

Many maneuvers of Unmanned Aerial Vehicles (UAV) can be considered within a framework of trajectory following. Though this issue can differ from one application to another, they all share the same problem of finding an optimal path (or signal) to perform the specified task. Finding this optimal trajectory is a challenging issue since it depends on both having an accurate mathematical model of t...

Improving Reinforcement Learning with Confidence-Based Demonstrations

Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent’s performance, relative to learning unaided, and 2) allow the target agent to outperform...

Reinforcement Learning with Multiple Demonstrations

Many tasks in robotics can be described as a trajectory that the robot should follow. Unfortunately, specifying the desired trajectory is often a non-trivial task. For example, when asked to describe the trajectory that a helicopter should follow to perform an aerobatic flip, one would have to not only (a) specify a complete trajectory in state space that intuitively corresponds to the aerobati...

Reinforcement Learning from Imperfect Demonstrations

Robust real-world learning should benefit from both demonstrations and interaction with the environment. Current approaches to learning from demonstration and reward perform supervised learning on expert demonstration data and use reinforcement learning to further improve performance based on reward from the environment. These tasks have divergent losses which are difficult to jointly optimize;...

Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)

In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...

Journal

Journal title: Drones

Year: 2023

ISSN: 2504-446X

DOI: https://doi.org/10.3390/drones7040232